Propagation dynamics on networks featuring complex topologies 
Analytical description of propagation phenomena on random networks has flourished in recent
years, yet more complex systems have mainly been studied through numerical means. In this 
paper, a mean-field description is used to coherently couple the dynamics of the network elements 
(nodes, vertices, individuals...) on the one hand and their recurrent topological patterns (subgraphs, 
groups...) on the other hand. In a SIS model of epidemic spread on social networks with community 
structure, this approach yields a set of ODEs for the time evolution of the system, as well as 
analytical solutions for the epidemic threshold and equilibria. The results obtained are in good 
agreement with numerical simulations and reproduce random-network behavior in the appropriate
limits, which highlights the influence of topology on the processes. Finally, it is demonstrated that,
in the absence of degree correlation, our model predicts higher epidemic thresholds for clustered 
structures than for equivalent random topologies. 

PACS numbers: 89.75.Hc, 87.23.Ge, 89.75.Kd 
I. INTRODUCTION

The description of propagation phenomena has been one of
the most prolific fields in complex network theory, mostly
because of the range of possible applications: epidemic
control, spread of information, virus or pollutant propagation
in electronic or biological networks [1]. Most analytical
models are based on the random network (RN)
paradigm: from the point of view of the propagating 
agent, random networks are seen as identical for every 
newly infected individual because of their treelike struc- 
ture (i.e. no loops). This approach has given rise to 
different descriptions: some are based on a compartmentalisation
of nodes according to their state [2], others on
the generating function formalism [3-6] or hybrid descriptions
using mean-field theory [7, 8]; yet all approaches are
difficult to generalize to real networks for which the RN 
paradigm rarely applies. 

The importance of topology for propagation dynamics
[1, 9-13], and more specifically the importance of
clustering [14-19], is now well established. That is, the
dynamics on the network is far from independent of how
links are arranged between its elements. Furthermore,
most real networks feature a significant amount of sub- 
structures that simply cannot be ignored as they define 
the very identity of the networks. The multi-protein units 
of molecular biology [20, 21], the coupling of a given set of
stocks [22, 23] or groups of highly connected individuals
[24, 25] are all good examples of how precise mechanisms
(e.g. the friend of my friend is my friend) give rise to im- 
portant structures within a seemingly random topology. 

The two limits of complex networks, complete random- 
ness and perfect order, can be treated with the previously 
discussed methods. We will concentrate on those partic- 
ular complex networks, located somewhere between order 
and disorder, and show how their topology can be taken 
into account in dynamical problems. In doing so, the language
of social networks and epidemics will be used to
take advantage of its eloquence and clarity. It should be
clear however that the formalism developed is general
to many types of networks and propagation phenomena.
The paper is structured as follows. The particular 
topology chosen to illustrate our approach, the community
structure (CS), is described in Sec. II. The analytical
model is then developed in Sec. III, where we also obtain
analytical solutions for the equilibria and epidemic
threshold of the system. Section IV compares our analytical
results with numerical Monte Carlo (MC) simulations
and presents discussions of our findings. After
presenting our conclusions in Sec. V, an Appendix completes
our analysis of propagation phenomena on community
structure.



II. COMMUNITY STRUCTURE 

In what follows, a new approach to describe dynamical 
problems on complex topologies will be used to solve a 
disease propagation model on social networks featuring 
a well-known topology: the community structure. We 
define this particular arrangement of nodes by their ag- 
gregation in highly connected groups. These communi- 
ties (or cliques) can virtually represent a person's family, 
workplace, collection of friends, etc. This simple concept 
results in a network with highly connected communities 
and a sparser density of links between them (see Fig. 1).
The topology of such networks has been studied at
some length: for its initial description, see [35]; for its
statistical significance, see [55]; for its detection or characterisation,
see [27-31]; and the references therein for
an exhaustive presentation.

Unfortunately, as with other complex types of networks,
studies of dynamical processes on this topology
have been mainly limited to numerical simulations (e.g.
[32]). Although useful to estimate its effect on the dynamics,
such simulations lack the clarity of an analytical framework.

FIG. 1. (Color online) Schematization of the particular topology
studied in this paper. An open mark represents a susceptible
individual; a shaded one, an infectious; and a black one,
a group (or clique). The topology is constructed by allow- 
ing individuals to belong to a given number of cliques where 
they can be linked to other participants (solid lines) and then 
randomly assigning random exterior neighbors (dotted lines). 
Note that in the formalism, the cliques are differentiable by 
their exact population and state, while the precise connec- 
tions between them remain unspecified and they are simply 
linked to a mean-field. 



On the other hand, mean-field description of propagation
phenomena in terms of communities (or households)
has been previously attempted in [33-35] with several
shortcomings such as homogeneous topology, lack of the
concept of individuals or inefficient moment closure approximations.
Hence, there is a need for an analytical
approach that can accurately take into account the many 
complexities of social networks in order to describe the 
time evolution of the system. Because community struc- 
ture typically includes clustering and degree correlation, 
our formalism will include the coherent contribution of 
both properties. 



A useful model of social topology was published by 
Newman in [18]. The networks are constructed as follows:
each individual belongs to m cliques and each clique
holds n individuals, where both m and n are taken from
given distributions. Within every clique, each pair of
members has a probability ε of being acquainted. Hence,
the entire topology is defined by one parameter ε and two
probability distributions {g_m} and {p_n} generated by the
following probability generating functions (PGFs):
$$P_0(z) = \sum_{n=0}^{\infty} p_n z^n , \tag{1}$$

$$G_0(z) = \sum_{m=0}^{\infty} g_m z^m , \tag{2}$$

which are simply built from the probabilities p_n and g_m
that a random clique or individual will have n participants
or m cliques, respectively. Similar functions can be
defined to generate the probabilities that a random clique
of a random individual is shared by n - 1 other participants
or that a random individual in a random clique
participates in m - 1 other cliques. We simply note that
these quantities are proportional to n p_n or m g_m and thus
find our second set of PGFs:
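Written out under the usual size-biased construction, this second set of PGFs would take the form below. This is a reconstruction from the proportionality argument in the text; the labels (3) and (4) are inferred from the surrounding equation numbering, and the sums run over the supports of {p_n} and {g_m}.

```latex
\begin{align}
P_1(z) &= \frac{1}{\nu} \sum_{n} n\, p_n\, z^{n-1} , \tag{3} \\
G_1(z) &= \frac{1}{\mu} \sum_{m} m\, g_m\, z^{m-1} , \tag{4}
\end{align}
```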
where ν and μ are respectively the mean numbers of individuals
per clique and cliques per individual used to
normalize the distributions. Note that the mean of a dis- 
tributed quantity is simply given by the derivative of the 
corresponding PGF evaluated at unity. The following 
topological properties have already been derived in |18j 
and [TS]: degree distribution, size of the giant compo- 
nent, clustering coefficient and degree correlation. Some 
of these results are used throughout this paper. 
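As an illustration of the two facts just stated, the following minimal Python sketch evaluates a PGF, recovers the mean as the derivative at unity (which for a PGF equals the sum of k p_k), and builds the size-biased "excess" distribution that underlies the second set of PGFs. The dictionary representation, the function names and the toy distribution are illustrative assumptions, not part of the original formalism.

```python
from typing import Dict


def pgf_eval(dist: Dict[int, float], z: float) -> float:
    """Evaluate the PGF sum_k dist[k] * z**k at the point z."""
    return sum(p * z ** k for k, p in dist.items())


def pgf_mean(dist: Dict[int, float]) -> float:
    """Mean of the distribution, i.e. the PGF derivative evaluated at z = 1."""
    return sum(k * p for k, p in dist.items())


def excess_distribution(dist: Dict[int, float]) -> Dict[int, float]:
    """Distribution of k - 1 when elements are reached proportionally to k * p_k."""
    mean = pgf_mean(dist)
    return {k - 1: k * p / mean for k, p in dist.items() if k > 0}


# Hypothetical clique-size distribution: half the cliques hold 2 people, half hold 4.
p_n = {2: 0.5, 4: 0.5}
nu = pgf_mean(p_n)                   # mean number of individuals per clique
p_excess = excess_distribution(p_n)  # coefficients of the size-biased PGF
```

Larger cliques are over-represented in `p_excess` because an individual is more likely to sit in a large clique than in a small one.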

Newman's model, although realistic because of its over- 
lapping communities, is strongly limited since links only 
arise through communities. A node belonging to a sin- 
gle clique does not participate at all in the coupling, 
while a node belonging to two cliques or more will have a 
huge influence. Hence, it is hard to describe weakly cou- 
pled communities of significant sizes using this particular 
topology. Consequently, we will introduce a more general 
description of community structure where exterior ran- 
dom links are also allowed. We simply add a distribution 
for the number of random links per individual, which is 
generated by: 
Our networks will thus be defined by the probability ε
and three distributions for the numbers of individuals
per clique (1), cliques per individual (2) and random links
per individual (5). Intuition indicates that a large number
of networks can be decomposed as basic structures
coupled either by sharing nodes, by forced connections or 
a combination of both. In fact, many of the previously 
cited papers study networks where nodes belong to a sin- 
gle clique coupled only by random links with the outside 
world (e.g. JT5], HH])- Our general model includes this 
topology and Newman's original model as special cases. 



III. SIS MODEL OF DISEASE PROPAGATION 
ON COMMUNITY STRUCTURE 

A. Construction of the dynamical model 



to at least one clique. We can directly write: 
The philosophy behind our formalism is to analyze the
network simultaneously from two perspectives, i.e. the 
state of the network is followed from the point of view 
of recurrent patterns in its topology and of the elements 
themselves. More precisely, we compartmentalize both 
the structure and the node ensemble in terms of their re- 
lation to one another and couple the two systems to give 
a complete description of the propagation phenomenon. 
For social networks featuring community structure, the 
recurrent patterns are cliques of individuals that can be 
distinguished by their size and their state. The elements 
are individuals distinguishable by the number of cliques 
to which they belong and by their number of exterior 
random links. That is, the mean state of a given class of 
individuals will act as if all of their cliques and random 
links were approximated by a mean-field and the mean 
state of a given class of cliques will act as if all individu- 
als were also reduced to a mean-field approximation. The 
behaviors of both cliques and individuals are coupled in 
terms of their connections via the generating functions
(1) through (5).

The particular case under study is a Susceptible-
Infectious-Susceptible (SIS) model of disease propagation.
In continuous time, an infectious node may pass
the disease to any of its susceptible neighbors at a rate τ
(S → I), while it recovers from the disease at a rate
r (I → S). Given initial conditions, we are interested in
developing a system of equations capable of following
the state I(t) of the network, where I(t) is the fraction
of infectious individuals at a given time. According to
our philosophy, we thus need to follow both individuals
and cliques. Let S_{m,l}(t) be the proportion of individuals
which belong to m cliques, have l random links and
are susceptible at time t, and C_{n,i}(t) be the proportion of
cliques whose population is n and of which i are infectious
at time t. For the sake of clarity, we will not explicitly
mark the time dependence, (t), when it is obvious that
the quantity is a dynamical variable.

First, we need to describe how the generating functions
G_1(z), K_0(z) and P_1(z) will differ depending on the state
of the involved individual. To define the dynamical generating
functions, it is possible to follow the distributions
of either the susceptible or the infectious individuals,
since S_{m,l} + I_{m,l} = g_m k_l. We will follow the susceptibles.
We then need the distribution of cliques reached from a
susceptible individual of a given clique. This distribution
will be affected by {S_{m,l}} in the following manner:
a random individual has a probability proportional to m g_m
of belonging to (m - 1) other cliques but, consequently, only a
probability Σ_l S_{m,l}/g_m of being susceptible at time t. The
reasoning is even simpler for K_0(z) as the distribution is
not affected by the knowledge that the individual belongs
to at least one clique. We can directly write:
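Following the reasoning above literally, the two susceptible-only generating functions can be sketched as below. This is a reconstruction from the argument in the text, not a transcription: the normalizations and the labels (6) and (7) are assumptions. Both expressions reduce to G_1(z) and K_0(z) when every individual is susceptible, i.e. when S_{m,l} = g_m k_l.

```latex
\begin{align}
\tilde{G}_1(z;t) &= \frac{\sum_{m,l} m\, S_{m,l}(t)\, z^{m-1}}{\sum_{m,l} m\, S_{m,l}(t)} , \tag{6} \\
\tilde{K}_0(z;t) &= \frac{\sum_{m,l} S_{m,l}(t)\, z^{l}}{\sum_{m,l} S_{m,l}(t)} , \tag{7}
\end{align}
```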
where the tilde denotes that the function generates a distribution
which applies to susceptible individuals only. In
a similar fashion, the knowledge that a clique is reached
by a link emerging from a susceptible individual will affect
the distribution of this clique's number of susceptible individuals.
The probability that a susceptible individual
belongs to a clique of state {n, i} is directly proportional
to the number of susceptible members of that particular
state. In order to consider only susceptible individuals,
the P_1(x, y) generating function must be modified according
to the number of susceptible members belonging to
each compartment:
Four interesting and important quantities can be derived
from these dynamical generating functions. Firstly, the 
average number of infectious neighbors per clique and per 
random link for a susceptible individual, R(t) and T(t): 



R(t) 



T{t) 
Secondly, the mean number of excess infectious neighbors
per clique and per random link for a susceptible individual
of a given clique, ρ(t) and σ(t):

$$\rho(t) = \tilde{G}_1'(1;t)\, R(t) , \tag{11}$$

$$\sigma(t) = \tilde{K}_0'(1;t)\, T(t) , \tag{12}$$

where the primes denote a derivative with respect to z,
so that $\tilde{G}_1'(1;t)$ is the average number of outside cliques
for a susceptible member of a given clique at time t.

Let us now construct the differential equation governing
{S_{m,l}}. We previously mentioned that the disease
spreads through any link between a susceptible and an
infectious individual. Thus, with R(t) being the average
number of such links that a susceptible may have in a
single clique, the rate at which the class of individuals
belonging to m cliques is infected is proportional to
-τ m S_{m,l} R(t). Similarly, with T(t) being the probability
that a random link leads to an infectious individual,
the rate of infection for individuals with l random links
must be proportional to -τ l S_{m,l} T(t). Simultaneously,
the same proportion increases as the infected nodes recover at
a rate r(g_m k_l - S_{m,l}). Therefore, the set of equations
governing the point of view of the individuals is simply
obtained by summing the contributions from these three
processes:



$$\frac{dS_{m,l}}{dt} = r\left(g_m k_l - S_{m,l}\right) - \tau S_{m,l}\left[m R(t) + l T(t)\right] . \tag{13}$$
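To make the structure of Eq. (13) concrete, here is a minimal Python sketch of its right-hand side. It treats R(t) and T(t) as externally supplied numbers, since their closed forms come from the dynamical generating functions; the dictionary layout, the function name `dS_dt` and the toy distributions are assumptions made for illustration only.

```python
from typing import Dict, Tuple

State = Dict[Tuple[int, int], float]


def dS_dt(S: State, g: Dict[int, float], k: Dict[int, float],
          R: float, T: float, tau: float, r: float) -> State:
    """Right-hand side of Eq. (13) for every class (m, l)."""
    return {
        (m, l): r * (g[m] * k[l] - s) - tau * s * (m * R + l * T)
        for (m, l), s in S.items()
    }


# Hypothetical toy distributions (assumed values, for illustration only).
g = {1: 0.5, 2: 0.5}   # cliques per individual
k = {0: 0.5, 1: 0.5}   # random links per individual
S_free = {(m, l): g[m] * k[l] for m in g for l in k}  # disease-free state

rates_free = dS_dt(S_free, g, k, R=0.0, T=0.0, tau=0.5, r=1.0)
rates_seeded = dS_dt(S_free, g, k, R=0.2, T=0.1, tau=0.5, r=1.0)
```

With no infectious neighbors the disease-free state is stationary, whereas any positive R or T drains every susceptible class.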



Similar considerations are needed to define the dynamics
of the C_{n,i} values. A clique in a {n, i} state can either
pass to {n, i + 1} by infection (if i < n) or to {n, i - 1}
by recovery (if i > 0). The first process is proportional
to the sum of the number of links between infectious
and susceptible individuals within the cliques and the
number of links with infectious neighbors that each susceptible
might have outside the considered clique. For a
given {n, i} compartment, infection can either bring new
cliques from the {n, i - 1} state or cause the cliques to
pass to the more infectious {n, i + 1} compartment:

$$\frac{dC_{n,i}}{dt} \propto \tau (n-i+1)\left[\varepsilon (i-1) + \rho(t) + \sigma(t)\right] C_{n,i-1} - \tau (n-i)\left[\varepsilon i + \rho(t) + \sigma(t)\right] C_{n,i} . \tag{14}$$
The contribution of the recovery process is easy to make explicit
using the same logic, as it is simply proportional
to the number of infectious individuals who might recover:

$$\frac{dC_{n,i}}{dt} \propto r (i+1)\, C_{n,i+1} - r\, i\, C_{n,i} . \tag{15}$$



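Summing the contributions of both the infections (14)
and the recoveries (15) yields the desired differential
equation for the cliques dynamics:

$$\frac{dC_{n,i}}{dt} = r(i+1)\, C_{n,i+1} - r\, i\, C_{n,i} + \tau (n-i+1)\left[\varepsilon(i-1) + \rho(t) + \sigma(t)\right] C_{n,i-1} - \tau (n-i)\left[\varepsilon i + \rho(t) + \sigma(t)\right] C_{n,i} , \tag{16}$$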
where C_{n,i} is defined only for i ∈ [0, n]. Coupled with
Eq. (13), we now have a complete dynamical system for
the state of the network in a SIS model of disease spread.
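The clique equation obtained by summing the infection and recovery contributions can be sketched in Python as follows. The mean fields ρ and σ are passed in as plain numbers, and the list layout (index i holds C_{n,i}), the function name `dC_dt` and the sample values are assumptions made for illustration.

```python
from typing import List


def dC_dt(C_n: List[float], n: int, rho: float, sigma: float,
          eps: float, tau: float, r: float) -> List[float]:
    """Rate of change of C_{n,i} for i = 0..n, summing infection and recovery terms."""
    xi = rho + sigma  # total mean field felt from outside the clique
    rates = []
    for i in range(n + 1):
        # Recovery: cliques arrive from state i + 1 and leave state i.
        recovery_in = r * (i + 1) * C_n[i + 1] if i < n else 0.0
        recovery_out = r * i * C_n[i]
        # Infection: cliques arrive from state i - 1 and leave state i.
        infection_in = (tau * (n - i + 1) * (eps * (i - 1) + xi) * C_n[i - 1]
                        if i > 0 else 0.0)
        infection_out = tau * (n - i) * (eps * i + xi) * C_n[i]
        rates.append(recovery_in - recovery_out + infection_in - infection_out)
    return rates


# Hypothetical state for cliques of size 3 (assumed values).
state = [0.10, 0.05, 0.03, 0.02]
rates = dC_dt(state, n=3, rho=0.2, sigma=0.1, eps=0.8, tau=0.5, r=1.0)
```

Because every term only moves cliques between neighboring infection states, the rates sum to zero and the proportion of cliques of size n is conserved.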
If desired, the mean fraction of infectious individuals
of a given class of cliques can be obtained in a straightforward
manner with:

$$I_n(t) = \frac{1}{n\, p_n} \sum_{i=0}^{n} i\, C_{n,i}(t) . \tag{17}$$



It is generally simpler to characterize the state of the network
via the total fraction of infectious, I(t), or susceptible,
S(t), individuals. From Eq. (13), we directly have:

$$S(t) = \sum_{m,l} S_{m,l} ; \qquad I(t) = \sum_{m,l} \left(g_m k_l - S_{m,l}\right) = 1 - S(t) . \tag{18}$$
Note that a straightforward evaluation of the global state
of the network from {C_{n,i}} would be biased because an
individual belonging to m cliques would be counted m
times more than an individual participating in a single
clique.



B. Solution for network stable state 



Systems (13) and (16) can be solved as a traditional self-consistent
field by looking for a solution in terms of ρ and
σ. Using Eq. (16) for the C_{n,i} quantities at equilibrium
(i.e. dC_{n,i}/dt = 0), we obtain the following recursive
solution:

$$C^*_{n,i+1} = \frac{1}{r(i+1)} \left[ \left(f_{n,i} + r\, i\right) C^*_{n,i} - f_{n,i-1}\, C^*_{n,i-1} \right] \tag{19}$$

with C*_{n,i} = 0 ∀ i ∉ [0, n], and where we introduce a
matrix of infection {f_{n,i}} whose elements depend on the
total mean-field ξ*:

$$f_{n,i} = \tau (n-i)\left(i \varepsilon + \xi^*\right) , \tag{20}$$

$$\xi^* = \rho^* + \sigma^* . \tag{21}$$

Asterisks will hereafter refer to values at equilibrium.



Equation (19) can be used to fix the stable values of
all the C*_{n,i} relative to C*_{n,0}, which can then be solved
exactly by applying the following topological constraint:

$$\sum_{i=0}^{n} C^*_{n,i} = p_n \quad \forall\, n . \tag{22}$$
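The recursion and its normalization can be sketched numerically as below, assuming the equilibrium condition takes the form r(i+1) C*_{n,i+1} = (f_{n,i} + r i) C*_{n,i} - f_{n,i-1} C*_{n,i-1} with f_{n,i} = τ(n-i)(iε + ξ*), and that the constraint fixes the sum over i to p_n. Time is measured in units of 1/r, so only the ratio `lam` = τ/r appears; the function name and the sample values are illustrative assumptions.

```python
from typing import List


def clique_equilibrium(n: int, p_n: float, xi: float,
                       eps: float, lam: float) -> List[float]:
    """Equilibrium values C*_{n,0..n} for a given total mean field xi (with r = 1)."""

    def f(i: int) -> float:
        # Infection matrix element f_{n,i} with tau expressed as lam * r, r = 1.
        return lam * (n - i) * (i * eps + xi)

    c = [1.0]  # unnormalized C*_{n,0}; the normalization is applied at the end
    for i in range(n):
        previous = f(i - 1) * c[i - 1] if i > 0 else 0.0
        c.append(((f(i) + i) * c[i] - previous) / (i + 1))
    total = sum(c)
    return [p_n * value / total for value in c]


# Hypothetical parameters (assumed values, for illustration only).
equilibrium = clique_equilibrium(n=2, p_n=0.3, xi=0.1, eps=0.8, lam=0.5)
disease_free = clique_equilibrium(n=3, p_n=0.2, xi=0.0, eps=0.8, lam=0.5)
```

With a vanishing mean field the recursion returns the trivial state in which every clique of size n is fully susceptible.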



Using the equilibrium condition on Eq. (13) provides a
direct solution for the S*_{m,l} ensemble:

$$S^*_{m,l} = \frac{r\, g_m k_l}{r + \tau \left(m R^* + l T^*\right)} . \tag{23}$$



It is then possible to write R*, T*, $\tilde{G}_1(z)$ and $\tilde{K}_0(z)$ in
terms of ρ* and σ* by using (19) in (9) and (10) while
using (23) in (6) and (7). A transcendental equation is
obtained for ξ* by writing (11) and (12) as:

$$\xi^* = \tilde{G}_1'(1)\, R^* + \tilde{K}_0'(1)\, T^* \equiv F(\xi^*) , \tag{24}$$

where the dependence on ξ* comes from that of {S*_{m,l}} on
R* and T*, written in terms of {C*_{n,i}}, which are a direct
function of ξ*. Solving for ξ* yields a unique non-zero
solution fixing {C*_{n,i}}, which in turn provides the values
for R* and T*. This directly fixes {S*_{m,l}} using (23), and
thus the stable state of the network defined by (18).

Clearly the dynamics is governed by the ratio λ = τ/r
and not the individual rates. Therefore, under the transformation
to the normalized propagation rate λ, our
model admits a single independent parameter in its dynamics.



C. Solution for epidemic threshold 

The epidemic threshold λ_c is defined by a phase transition
in the normalized infection rate where a macroscopic
FIG. 2. (Color online) Function F(ξ*) is shown in shade on
the topology defined in (32) for two different normalized propagation
rates: λ = 0.02 in dotted line (under the threshold;
no solution for ξ* > 0) and λ = 0.1 in solid line (epidemic).
The black solid line is the curve of slope 1, F(ξ*) = ξ*.



final epidemic size first appears. Here, it can be defined
mathematically using the analytic solution for the stable
state of the SIS epidemic. Equation (24) behaves as
shown in Fig. 2 with a trivial solution at ξ* = 0 and
another possible solution ξ* > 0 depending on λ and
the topology. Since F(ξ*) is a monotonically increasing
function, λ_c can be found by the following condition:

$$\left. \frac{d}{d\xi^*} F(\xi^*) \right|_{\xi^* = 0} = 1 . \tag{25}$$

For an initial derivative value above unity, a solution ξ* > 0
exists and the stable epidemic state is non-zero (Fig. 2).
For a system subject to a propagation at its threshold,
by definition, we know that the stable state is the trivial
solution C*_{n,i} = p_n δ_{i0} ∀ {n, i} and S*_{m,l} = g_m k_l ∀ {m, l}
(which implies $\tilde{G}_1(z;t) = G_1(z)$ and $\tilde{K}_0(z;t) = K_0(z)$).
It follows that the mean-field values are zero at equilibrium
and (25) straightforwardly becomes:



l£{ez(n-*)GUl) + «l)}A<7* 



1. (26) 
Using (19) to evaluate the derivative at equilibrium, one
finds that ∀ i > 0:

$$\left. \frac{d C^*_{n,i}}{d\xi^*} \right|_{\xi^*=0} = p_n\, \lambda_c^{\,i}\, \varepsilon^{\,i-1}\, \frac{n!}{i\,(n-i)!} . \tag{27}$$



Using (27) to solve (26) for λ_c provides a polynomial with
positive coefficients for terms of order one or more:
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,1 



n,i>0 



If. 



K' {1) 



E^^T^T [(n-i)G' 1 (l) + ^^ ) = 1 . (28) 



This polynomial therefore has a single real positive so- 
lution, which is the epidemic threshold of the network. 



FIG. 3. (Color online) Degree distribution in the infinite network
limit of the chosen topologies: (33) is shown by a solid
shaded line while (35) is shown by a dotted black line. Note
the periodic local maxima corresponding to each m value.

For random networks, one can set K_0'(1) = 0, ε = 1 and
p_n = δ_{n2} so that all links are shared within cliques of
size two. Expression (28) then reduces to:

$$G_1'(1)\, \lambda_c^{RN} = 1 , \tag{29}$$



where G_1'(1) is here the mean excess degree. From Eq.
(29), one can deduce that our model predicts a null SIS
epidemic threshold only if G_1'(1) diverges. For scale-free
networks whose degree distribution falls as k^{-s}, it can be
shown that G_1'(1) diverges if s < 3. Our model therefore
leads to the same conclusion as [36]: scale-free networks
with degree distribution p_k ∝ k^{-s} and s < 3 are characterized
by an absence of epidemic threshold.
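A short numerical sketch can illustrate this random-network limit, assuming the threshold is the inverse of the mean excess degree, computed as <k(k-1)>/<k>. The truncated power law, the cutoff values and the function names are assumptions introduced only to show the trend for an exponent s < 3.

```python
from typing import Dict


def mean_excess_degree(p_k: Dict[int, float]) -> float:
    """Mean excess degree <k(k-1)> / <k> of a degree distribution p_k."""
    mean_k = sum(k * p for k, p in p_k.items())
    return sum(k * (k - 1) * p for k, p in p_k.items()) / mean_k


def rn_threshold(p_k: Dict[int, float]) -> float:
    """Random-network threshold, taken as the inverse of the mean excess degree."""
    return 1.0 / mean_excess_degree(p_k)


def truncated_power_law(s: float, k_max: int) -> Dict[int, float]:
    """Normalized p_k proportional to k**(-s) on 1..k_max (cutoff is an assumption)."""
    weights = {k: float(k) ** (-s) for k in range(1, k_max + 1)}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


lam_regular = rn_threshold({3: 1.0})  # every node has degree 3
lam_cut_100 = rn_threshold(truncated_power_law(2.5, 100))
lam_cut_10000 = rn_threshold(truncated_power_law(2.5, 10000))
```

Raising the cutoff lowers the threshold, which is the finite-size signature of the divergence discussed above.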

A calculation of the SIS epidemic threshold on random 
networks was previously done in [37] , using discrete time 
steps and constant recovery period approximations. To 
the best of our knowledge, Eq. (28) is the first equation
for a continuous-time SIS model of epidemic spread for
both random networks and community structure.



IV. IMPLEMENTATION AND VALIDATION 

A. Treatment of the analytical model 

In order to highlight the difference between RN and 
CS, both types of networks will be studied analytically 
and numerically. The CS network will be compared with 
its equivalent random network (ERN): a network with 
exactly the same degree distribution, but with randomly 
connected nodes (zero degree correlation). Note that on
our general model of community structure, the PGF for
the degree distribution is simply generated by [18]:

$$G_0\!\left(P_1\!\left(1 + (z-1)\varepsilon\right)\right) K_0(z) . \tag{30}$$

To describe an ERN with this distribution, two simple
options are available. Firstly, one can set P_0^{ERN}(z) = z^2
FIG. 4. (Color online) Comparisons of analytical and numerical results on a network defined by (32) using normalized dynamics
(t → rt and λ = τ/r). (a) Analytical stable states (curves) and epidemic thresholds (vertical lines at λ_c^{CS} = 3.54 · 10^{-2} and
λ_c^{ERN} = 3.44 · 10^{-2}). (b) Time evolution (curves) and analytical equilibrium (horizontal line) for λ = 0.5. In both figures, the
results are shown in solid shade for the community structure (CS) and in dotted black line for the equivalent random network
(ERN). Numerical results are presented by markers and are averaged over 20 000 networks of 25 000 nodes. The standard
deviation is smaller than the marker size.



and K_0^{ERN}(z) = 1 with ε^{ERN} = 1 so that all cliques are
of size two (i.e. regular links) and then choose the {g_m}
distribution equal to the initial degree distribution (30)
of the CS network. Secondly, one can set P_0^{ERN}(z) = z
and G_0^{ERN}(z) = z with any ε^{ERN} so that all cliques are
of size one (i.e. simple nodes) and then choose the {k_l}
distribution equal to the initial degree distribution (30).
Both will be used in what follows.

The time evolution of the analytical system is obtained
from an integration based on a 4th order Runge-Kutta
algorithm with adaptive time steps. The initial condition
I(0) is uniformly distributed among the nodes. That is,
S_{m,l}(0) = g_m k_l (1 - I(0)) for all {m, l}, while {C_{n,i}(0)}
are given by a simple Bernoulli trial:

$$C_{n,i}(0) = p_n \binom{n}{i} \left[I(0)\right]^{i} \left[1 - I(0)\right]^{n-i} . \tag{31}$$
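These initial conditions can be sketched in Python as follows, assuming the Bernoulli trial translates into a binomial distribution of infectious members within each clique size. The function names, the dictionary layout and the toy distributions are illustrative assumptions.

```python
from math import comb
from typing import Dict, Tuple


def initial_susceptibles(g: Dict[int, float], k: Dict[int, float],
                         I0: float) -> Dict[Tuple[int, int], float]:
    """S_{m,l}(0) = g_m k_l (1 - I(0)) for every class (m, l)."""
    return {(m, l): g[m] * k[l] * (1.0 - I0) for m in g for l in k}


def initial_cliques(p: Dict[int, float], I0: float) -> Dict[Tuple[int, int], float]:
    """C_{n,i}(0): binomial spread of I(0) among the n members of each clique."""
    return {
        (n, i): p_n * comb(n, i) * I0 ** i * (1.0 - I0) ** (n - i)
        for n, p_n in p.items()
        for i in range(n + 1)
    }


# Hypothetical distributions and seed fraction (assumed values).
g = {1: 0.5, 2: 0.5}
k = {0: 0.5, 1: 0.5}
p = {2: 0.4, 3: 0.6}
S0 = initial_susceptibles(g, k, I0=0.01)
C0 = initial_cliques(p, I0=0.01)
```

Summing C0 over i for a fixed n returns p_n, so the topological normalization of the cliques holds from the first step.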



B. Numerical model 

To perform MC simulations of the model, we have generated
networks with the structure presented in Sec.
II via the following numerical algorithm:

i. generate a sequence {m_i} of length N subjected to
distribution {g_m};

ii. generate a sequence {n_j} subjected to distribution
{p_n} until Σ_j n_j = Σ_i m_i;

iii. for each i, produce m_i individuals tagged as i;

iv. for each j, produce n_j groups tagged as j;

v. randomly assign each individual to a group;

vi. for each j, list every i assigned to the n_j groups and
link them to one another with probability ε;

vii. generate a sequence {l_s} of length N subjected to
the distribution {k_l} under condition that Σ_s l_s is
even;

viii. for each s, produce l_s stubs tagged as s;

ix. randomly link all stubs in pairs.
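The steps above can be sketched in Python as follows. This is a minimal illustration rather than the code used for the reported simulations: the function names, the restart-on-overshoot rule used to reach an exact match in step ii, and the choice to drop self-loops and repeated links by storing edges in a set are all assumptions.

```python
import random
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple


def sample(dist: Dict[int, float], rng: random.Random) -> int:
    """Draw one value from a discrete distribution given as {value: probability}."""
    values, weights = zip(*dist.items())
    return rng.choices(values, weights=weights)[0]


def build_network(N: int, g: Dict[int, float], p: Dict[int, float],
                  k: Dict[int, float], eps: float,
                  seed: Optional[int] = None) -> Set[Tuple[int, int]]:
    rng = random.Random(seed)
    # Step i: number of cliques for each individual.
    m = [sample(g, rng) for _ in range(N)]
    target = sum(m)
    # Step ii: clique sizes until the totals match (restart on overshoot: assumption).
    sizes: List[int] = []
    total = 0
    while total != target:
        n = sample(p, rng)
        if total + n > target:
            sizes, total = [], 0
            continue
        sizes.append(n)
        total += n
    # Steps iii-v: one membership stub per (individual, clique) slot, shuffled.
    stubs = [i for i, m_i in enumerate(m) for _ in range(m_i)]
    rng.shuffle(stubs)
    edges: Set[Tuple[int, int]] = set()
    position = 0
    for n in sizes:
        members = sorted(set(stubs[position:position + n]))
        position += n
        # Step vi: link each pair of members with probability eps.
        for a, b in combinations(members, 2):
            if rng.random() < eps:
                edges.add((a, b))
    # Step vii: random-link counts with an even total (loops until satisfied).
    while True:
        links = [sample(k, rng) for _ in range(N)]
        if sum(links) % 2 == 0:
            break
    # Steps viii-ix: pair the link stubs at random.
    link_stubs = [s for s, l_s in enumerate(links) for _ in range(l_s)]
    rng.shuffle(link_stubs)
    for a, b in zip(link_stubs[::2], link_stubs[1::2]):
        if a != b:  # self-loops dropped (assumption)
            edges.add((min(a, b), max(a, b)))
    return edges


# Hypothetical example: 10 individuals, one clique each, all cliques are full pairs.
pair_edges = build_network(10, {1: 1.0}, {2: 1.0}, {0: 1.0}, eps=1.0, seed=0)
```

In this degenerate example the construction reduces to a perfect matching, which makes the output easy to check by hand.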



The final ensemble of links presents a topology as shown
in Fig. 1 with a degree distribution generated by (30),
where nodes are highly clustered but the clique concept
itself is invisible. Each and every network generated by
this procedure is accepted and kept in the results, as
they are part of the canonical ensemble considered by the
mean-field approach of the formalism. For every generated
network, a fraction I(0) of individuals are randomly
chosen to be initially infectious and the dynamics is then
simulated in a discrete time propagation simulation valid
for a time step Δt → 0 (we choose Δt such that τΔt and
rΔt are less than 10^{-3}):



i. at each Δt, every susceptible neighbor of every infectious
individual is infected with probability τΔt;

ii. at each Δt, every infectious individual recovers with
probability rΔt.
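One time step of this discrete-time scheme can be sketched in Python as below. The adjacency-dictionary representation, the function name `sis_step` and the synchronous update (infections and recoveries both evaluated on the state at the start of the step) are assumptions made for illustration.

```python
import random
from typing import Dict, Iterable, Set


def sis_step(neighbors: Dict[int, Iterable[int]], infected: Set[int],
             tau: float, r: float, dt: float, rng: random.Random) -> Set[int]:
    """Advance the SIS dynamics by one time step dt and return the new infected set."""
    newly_infected: Set[int] = set()
    for u in infected:
        for v in neighbors[u]:
            # Each infectious-susceptible link transmits with probability tau * dt.
            if v not in infected and rng.random() < tau * dt:
                newly_infected.add(v)
    # Each infectious individual recovers with probability r * dt.
    recovered = {u for u in infected if rng.random() < r * dt}
    return (infected | newly_infected) - recovered


# Hypothetical three-node chain 0 - 1 - 2 with node 0 initially infectious.
chain = {0: [1], 1: [0, 2], 2: [1]}
rng = random.Random(42)
after_certain_infection = sis_step(chain, {0}, tau=1.0, r=0.0, dt=1.0, rng=rng)
after_certain_recovery = sis_step(chain, {0}, tau=0.0, r=1.0, dt=1.0, rng=rng)
```

The two calls use extreme probabilities only to make the outcome deterministic; an actual run would use a small dt as prescribed above.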

Finally, for each constructed network, the final degree 
distribution is used to generate an ERN for comparison. 



FIG. 5. (Color online) Comparison between analytical and numerical results on a network with general community structure
defined by (34) for a SIS model of propagation dynamics of parameter λ = τ/r = 0.5 under normalized time t → rt. (a) Time
evolution of the global state (community structure in solid shade and equivalent random network in dotted black) and (b)
time evolution for cliques of size 10, 15, 20, 25, 30 and 35 (lowest to highest curves). All numerical results are obtained via
MC simulations on over 20000 networks of 25000 nodes and are presented by their mean value. Analytical predictions for the
stable states are shown in horizontal dotted lines in both figures. Note that the deviation from the predictions is bigger for
the smallest cliques than for the larger ones. This is a consequence of the mean-field description, which is more accurate for
large systems (or, in this case, subsystems) for which standard deviations are of lesser relative importance.



C. Results on Newman's topology 

The first topology chosen to test the formalism is the
special model presented in [18], which does not allow random
links and is thus obtained by setting K_0(z) = 1
(i.e. all links are shared within a clique). We will then
use ε = 0.8, a power-law distribution for the numbers of
cliques per individual and a Poisson distribution for the
numbers of individuals per clique:

$$g_m \propto m^{-1} e^{-m/1.2} , \qquad p_n \propto \frac{20^n}{n!}\, e^{-20} . \tag{32}$$


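This topology results in a degree distribution generated
by the following function:

$$G_0\!\left(P_1\!\left(1 + (z-1)\varepsilon\right)\right) = \frac{\ln\!\left(1 - e^{20(\varepsilon z - \varepsilon)}\, e^{-5/6}\right)}{\ln\!\left(1 - e^{-5/6}\right)} . \tag{33}$$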



This heterogeneous distribution is shown in Fig. 3. To
follow the propagation dynamics on an ERN, we use the
first of the two options previously presented: all cliques
are of size two with ε^{ERN} = 1 and a distribution {g_m}
equivalent to (33).

Our results on this topology, Fig. 4, confirm that our
formalism is indeed capable of following the time evolution
of the network in structured and random topologies.
Furthermore, both our numerical and analytical results
support the conclusions of [32, 35, 38], as will be discussed
below.

Firstly, as evident in Fig. 4(a), the community structure
does not significantly change the stable state of the
system. This conclusion is only valid when the giant components
of CS and of the ERN have approximately the
same size and under condition that the network is well
connected. In physical terms, this means that the coupling
must be sufficiently high between the subsystems,
relative to the strength of the interaction (i.e. λ). If this
condition is not fully met, subsets of the canonical distribution
of configurations (i.e. ones with higher number
of independent cliques) will have stable states under the
predicted value and will decrease the mean value. The
reduction of the giant component was already explained
in [18]. This effect is visible in both analytical and numerical
results of Fig. 4(a) for lower infection rate and
eventually leads to a higher epidemic threshold for networks
with community structure.

This particular property seems to contradict a ma- 
jor conclusion of [18], yet it is important to take into 
account that the conclusion that clustering lowers the 
epidemic threshold was made on networks featuring dif- 
ferent degree distributions (see [35] for a complete dis- 
cussion) and featuring degree correlation (see [14] for an 
analysis of correlation and clustering effects). Our re- 
sults show that, given an identical degree distribution 
and zero degree correlation, the random networks will 
have a lower epidemic threshold than a network featuring
community structure. This conclusion is intuitive because
links shared within a community have a higher probability
of being "wasted" (i.e. of leading to another infectious
node) than a random link, independently of the transmissibility.
The mechanism behind this phenomenon is
simple: there is a higher probability that neighbors of a
new infectious individual will also be infectious if these
individuals are connected in groups. This leads to a lower
mean epidemic size for low infection rate and to the observed
higher epidemic threshold. Note that, within the
community structure effects observed here, the individual
effects of clustering and degree correlation cannot
be separated. The demonstration given in the Appendix
shows that, for networks with zero degree correlation,
our model always predicts a higher epidemic threshold
for networks with clustering than for equivalent random
networks. However, it should be emphasized that correlation
effects alone have been shown to lower the percolation
threshold [30]. As similar effects can take place
on networks with community structure, our conclusion
is not directly generalizable to networks with non-zero
degree correlation.

Secondly, as seen in Fig. 4(b), the community structure
increases the relaxation time of the system; i.e. it
slows the disease propagation towards the equilibrium. 
This phenomenon is also explained by the higher num- 
ber of wasted links on a community structure than on 
the equivalent random network. These links are very 
frequent in social networks because of community struc- 
ture where "the friend of my friend is also my friend" . 
When counting new possible infections on networks with 
exactly the same degree distribution, the number of sec- 
ond neighbors will be higher in a random network than 
on a community structure, because the neighbors of my 
neighbor may have already been counted as my neighbor 
in the CS network. This results in a slower propagation 
and a typically higher epidemic threshold. 

Finally, note that the shift observed in the epidemic
threshold is not always as small as seen in Fig. 4(a). For
example, a topology with $G_1'(1) \simeq 0.365$ and $\nu = 5$ yields
$\lambda_c^{CS} = (5/4)\,\lambda_c^{ERN}$. This particular case was verified by
MC simulations.
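
The size of such a shift can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the threshold condition (26), for cliques of a single size $\nu$ and a mean number $\mu_1$ of excess cliques per individual, to have the form $\mu_1 \sum_{i=1}^{\nu-1} \frac{(\nu-1)!}{(\nu-1-i)!} (\lambda\epsilon)^i = 1$, which is the form suggested by inequality (A.8) of the Appendix, and it uses (A.7) for the equivalent random network. The function names and the bisection solver are illustrative choices, not part of the formalism.

```python
from math import factorial


def clique_polynomial(y, nu, mu1):
    """Left-hand side of the ASSUMED threshold condition, with y = lambda * epsilon:
    mu1 * sum_{i=1}^{nu-1} (nu-1)!/(nu-1-i)! * y**i."""
    return mu1 * sum(
        factorial(nu - 1) // factorial(nu - 1 - i) * y**i
        for i in range(1, nu)
    )


def threshold_ratio(nu, mu1, tol=1e-12):
    """Ratio lambda_c^CS / lambda_c^ERN under the assumptions stated above."""
    # ERN threshold from (A.7), written in the rescaled variable y = lambda * epsilon.
    y_ern = 1.0 / ((nu - 2) + mu1 * (nu - 1))

    # The polynomial vanishes at y = 0 and increases with y, so bracket the root
    # of clique_polynomial(y) = 1 and then bisect.
    lo, hi = 0.0, 1.0
    while clique_polynomial(hi, nu, mu1) < 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if clique_polynomial(mid, nu, mu1) < 1.0:
            lo = mid
        else:
            hi = mid
    y_cs = 0.5 * (lo + hi)
    return y_cs / y_ern


# Cliques of size 5 with a mean of 0.365 excess cliques per individual.
ratio = threshold_ratio(nu=5, mu1=0.365)
print(f"lambda_c^CS / lambda_c^ERN = {ratio:.4f}")
```

Because $\epsilon$ enters both thresholds only through the product $\lambda\epsilon$ in this assumed form, the ratio does not depend on $\epsilon$, and it reduces to one for $\nu = 2$.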



D. Results on a general topology 

As a second test of our formalism, we use $\epsilon = 0.8$ and
the following distributions:

(34)



which result in the second degree distribution shown in
Fig. 3 and generated by:

$$ \frac{\ln\left(1 - e^{20(z-1)}\, e^{-5/6}\right)}{\ln\left(1 - e^{-5/6}\right)} \; \frac{\ln\left(1 - z e^{-1}\right)}{\ln\left(1 - e^{-1}\right)} . \tag{35} $$



In this case, the ERN are obtained by using cliques of size
one and fitting the degree distribution with the random
links generated by $K(z)$. The results obtained on this
second topology are presented in Fig. 5. They not only
confirm the quality of our treatment, but also earlier
conclusions. The propagation slow-down is stronger in the
time evolution featured in Fig. 5(a) than in the case
observed in Fig. 4(b) because the topology used produces
a much higher proportion of intra-clique links for a given
individual and, consequently, a higher fraction of wasted
links. It is believed that this effect could be studied
using percolation theory with a quantification of CS, such
as the modularity concept introduced by Newman and
Girvan in [29].



CONCLUSION 



What may well be the single most important contri- 
bution of this paper is the philosophy upon which the 
formalism is based. An effective dynamical description 
of complex networks can be obtained by a mean-field 
approach using a compartmentalisation of both the net- 
works' elements (e.g. individuals or nodes) and of their 
recurrent topological patterns (e.g. cliques or substruc- 
tures) in classes of homogeneous state and behavior. It 
has been shown that a particular topology, the commu- 
nity structure, can be solved with this method. Fur- 
thermore, the approach can also describe random topol- 
ogy in the limit of the most elementary patterns possi- 
ble. Hence, it is reasonable to assert that other complex 
topologies may be treated in a similar manner. 

More precisely, our analytical results confirm previous 
numerical simulations on the effects of community struc- 
ture in propagation dynamics: in comparison to equiv- 
alent random networks, the structured systems feature 
longer relaxation times (i.e. slower propagation) and gen- 
erally higher epidemic thresholds. 

An especially interesting avenue to explore would be
to direct the formalism towards more epidemiologically
oriented applications with a generalization to other
propagation models (see for example [41]). Furthermore, in an
epidemic context, taking the topology of social networks
into account allows precise emulation of real intervention
scenarios, which are often based on groups of individuals
(e.g. school closings and vaccination of public
health workers both correspond to interventions on given
cliques).

Other applications of our formalism are possible in various
models of dynamics and topologies. Of particular
interest is the application of our formalism to dynamical
networks (e.g. [J2H32]). This may help in gaining insight
into the emergence and the stability of social structure.

Appendix: Community structure, without degree
correlation, raises the epidemic threshold

This paper has shown that our model can describe
propagation phenomena on networks with community
structure as well as on networks with random topology.
Using the analytic solution for the epidemic threshold on
Newman's topology, it is possible to show that, given
two networks with identical degree distributions and zero
degree correlation, but where one is completely random
while the other features community structure (and therefore
clustering), the latter will have a higher epidemic
threshold.

First of all, degree correlation refers to situations 
where, given a random link in the network, the knowl- 
edge of the excess degree of one of its nodes influences 
the probability distribution for the excess degree of the 
other. For Newman's model, it was shown in [19] that 
the probability $e_{jk}$ that a given link joins two nodes of
excess degree $j$ and $k$ can be calculated as follows. We
first write:

$$ e_{jk} = \frac{1}{N} \sum_n p_n\, n(n-1)\, P(j,k|n) , \tag{A.1} $$

where $n(n-1)$ is the number of potential degrees in a
clique of size $n$, $N$ is a normalization factor corresponding
to the total number of potential links in the network
and $P(j,k|n)$ is the probability that a link within a clique
of size $n$ joins two nodes of excess degree $j$ and $k$. This
probability can be calculated by separating $j$ into $j_{in}$ and
$j_{out}$, respectively the excess links shared within and outside
of the considered clique, and doing the same for $k$.
We can now write:

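$$ P(j,k|n) = \sum_{j_{in}} \binom{n-2}{j_{in}} \epsilon^{j_{in}} (1-\epsilon)^{n-2-j_{in}} P(j_{out}) \times \sum_{k_{in}} \binom{n-2}{k_{in}} \epsilon^{k_{in}} (1-\epsilon)^{n-2-k_{in}} P(k_{out}) , \tag{A.2} $$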



where $P(j_{out})$ and $P(k_{out})$ are the probabilities that the
nodes have $j_{out}$ and $k_{out}$ links outside the clique of size
$n$. These two probabilities are simply generated by the
PGF composition $G_1(P_1(1+(z-1)\epsilon))$. Now, because
both $k$ and $j$ must be calculated with one clique in common
where they both have $n-2$ potential excess neighbors,
we can write the set of $\{e_{jk}\}$ in terms of the following
PGF:

$$ \sum_{j,k} e_{jk}\, x^j y^k = P_2\big((1+(x-1)\epsilon)(1+(y-1)\epsilon)\big) \times G_1\big(P_1(1+(x-1)\epsilon)\big)\, G_1\big(P_1(1+(y-1)\epsilon)\big) , \tag{A.3} $$

where $P_2(z) = [P_0''(1)]^{-1} \sum_n p_n\, n(n-1)\, z^{n-2}$. For a random
network, it is easily obtained that $e_{jk}$ is simply the product
of the two independent probabilities of having nodes
of excess degree $j$ and $k$. Thus, by differentiating the
degree distribution PGF (30) to obtain the excess degree
distribution, we find:

$$ \sum_{j,k} e_{jk}^{RN}\, x^j y^k = P_2\big(1+(x-1)\epsilon\big)\, G_1\big(P_1(1+(x-1)\epsilon)\big) \times P_2\big(1+(y-1)\epsilon\big)\, G_1\big(P_1(1+(y-1)\epsilon)\big) . \tag{A.4} $$



For expressions (A.3) and (A.4) to be equivalent, the
following condition must be satisfied:

$$ P_2\big((1+(x-1)\epsilon)(1+(y-1)\epsilon)\big) = P_2\big(1+(x-1)\epsilon\big)\, P_2\big(1+(y-1)\epsilon\big) . \tag{A.5} $$

We want to compare two networks sharing exactly the
same degree distribution and degree correlation. Equation
(A.5) gives us the condition under which two networks
with identical degree distributions, one featuring community
structure and the other random topology, will have
the same degree correlation. It is easy to conclude that
the distribution of individuals per clique, in order to
respect Eq. (A.5), can only be given by:

$$ p_n = \delta_{n,\nu} , \tag{A.6} $$



where $\nu$ is an arbitrary positive integer. In other words,
all structures must be the same size. This limitation
comes from the way we construct our random networks:
because degrees generated from a given distribution are
simply matched at random, the knowledge of one neighbor's
degree does not give any information concerning the other
neighbor's degree. Note that $G_0(z)$ and $\epsilon$ are totally free,
so that the heterogeneity of the degree distribution is not
entirely compromised.
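
Condition (A.5) can be probed numerically for specific clique-size distributions. The following sketch evaluates the difference between the two sides of (A.5), using the definition of $P_2(z)$ given above. The two distributions, the value of $\epsilon$ and the evaluation point are hypothetical examples, and a single non-zero gap only illustrates the condition; it is not a proof that (A.6) is the unique solution.

```python
def make_P2(p):
    """Build P_2(z) = sum_n p_n n(n-1) z^(n-2) / sum_n p_n n(n-1)
    from a clique-size distribution given as a dict {n: p_n}."""
    norm = sum(pn * n * (n - 1) for n, pn in p.items())

    def P2(z):
        return sum(pn * n * (n - 1) * z ** (n - 2) for n, pn in p.items()) / norm

    return P2


def a5_gap(p, eps, x, y):
    """Left-hand side minus right-hand side of condition (A.5)."""
    P2 = make_P2(p)
    a = 1 + (x - 1) * eps
    b = 1 + (y - 1) * eps
    return P2(a * b) - P2(a) * P2(b)


single_size = {4: 1.0}           # p_n = delta_{n,4}, the form required by (A.6)
mixed_sizes = {3: 0.5, 5: 0.5}   # hypothetical mixture of two clique sizes

print("single size:", a5_gap(single_size, eps=0.8, x=0.3, y=0.6))
print("mixed sizes:", a5_gap(mixed_sizes, eps=0.8, x=0.3, y=0.6))
```

With a single clique size, $P_2(z) = z^{\nu-2}$ is multiplicative and the gap vanishes up to rounding, whereas the mixture leaves a finite gap at this evaluation point.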

We will now compare two networks with zero degree
correlations. The first is random with $p_n^{ERN} = \delta_{n,2}$ and
$\epsilon^{ERN} = 1$, while the other exhibits community structure
with $p_n^{CS} = \delta_{n,\nu}$ with $\nu > 2$ and $\epsilon^{CS} = \epsilon \in [0,1]$. The
two networks have exactly the same degree distribution,
which means that $G_0^{ERN}(z) = G_0^{CS}(P_1^{CS}(1+(z-1)\epsilon))$.
Using Eq. (28), we can easily write the epidemic threshold
for the random network:

$$ \lambda_c^{ERN} = \frac{1}{G_1'^{\,ERN}(1)} = \frac{1}{\epsilon\left[(\nu-2) + \mu_1(\nu-1)\right]} , \tag{A.7} $$



where the last expression uses the PGFs of the structured
network in which $\mu_1 = G_1'^{\,CS}(1)$ is the mean number of
excess cliques per individual. We will now insert expression
(A.7) in the epidemic threshold condition (26) of the
network with community structure. Because all terms in
the polynomial are positive, we expect to find an expression
greater than unity if (A.7) is higher than the threshold
for CS, equal to one if the threshold remains the same
or less than unity if the threshold for the ERN is actually
lower than that for CS. To prove the latter case,
for arbitrary $\nu$, $\epsilon$ and $\{g_m\}$, we simply demonstrate the
following inequality written from (26) using (A.7):

$$ \mu_1 \sum_{i=1}^{\nu-1} \frac{(\nu-1)!}{(\nu-i-1)!} \left[(\nu-2) + \mu_1(\nu-1)\right]^{-i} < 1 . \tag{A.8} $$
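
The closed form for the mean excess degree that enters (A.7), and therefore (A.8), can be sanity-checked numerically. The sketch below assumes, as stated above, that the degree distribution is generated by $G_0^{CS}(P_1^{CS}(1+(z-1)\epsilon))$ with cliques of a single size $\nu$, so that an individual belonging to $m$ cliques keeps each of its $m(\nu-1)$ potential links independently with probability $\epsilon$. The distribution `g`, the parameter values and the function names are hypothetical.

```python
from math import comb


def degree_pmf(g, nu, eps):
    """Degree distribution when an individual in m cliques of size nu keeps each
    of its m*(nu-1) potential links independently with probability eps."""
    pmf = {}
    for m, gm in g.items():
        n_links = m * (nu - 1)
        for k in range(n_links + 1):
            prob = comb(n_links, k) * eps**k * (1 - eps) ** (n_links - k)
            pmf[k] = pmf.get(k, 0.0) + gm * prob
    return pmf


def mean_excess_degree(pmf):
    """G_1'(1) = <k(k-1)> / <k> for a degree distribution {k: p_k}."""
    first = sum(k * pk for k, pk in pmf.items())
    second = sum(k * (k - 1) * pk for k, pk in pmf.items())
    return second / first


g = {1: 0.5, 2: 0.3, 3: 0.2}   # hypothetical distribution of cliques per individual
nu, eps = 4, 0.6               # hypothetical clique size and link probability

# Mean number of excess cliques per individual, mu_1 = <m(m-1)> / <m>.
mu1 = sum(m * (m - 1) * gm for m, gm in g.items()) / sum(m * gm for m, gm in g.items())

direct = mean_excess_degree(degree_pmf(g, nu, eps))
closed_form = eps * ((nu - 2) + mu1 * (nu - 1))
print(direct, closed_form)
```

The two printed values should agree up to rounding, which is the identity used in the denominator of (A.7).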



Further, it can be shown that the derivative of (A.8) with
respect to $\mu_1$ is always positive. This provides us with an upper
bound for (A.8) in the limit $\mu_1 \to \infty$. Using l'Hôpital's
rule, we thus find:

$$ \lim_{\mu_1 \to \infty} \mu_1 \sum_{i=1}^{\nu-1} \frac{(\nu-1)!}{(\nu-i-1)!} \left[(\nu-2) + \mu_1(\nu-1)\right]^{-i} = 1 . \tag{A.9} $$



This indicates that the two networks with zero degree
correlation, one featuring community structure and one
an equivalent random network, will have the same threshold
in the limit of infinite mean number of excess cliques
per individual or if $\nu = 2$. Otherwise, because the derivative
of the polynomial in $\mu_1$ was shown to be positive,
finite $\mu_1$ and $\nu > 2$ imply a higher threshold for the
structured network.
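
The behavior described above can be illustrated by evaluating the left-hand side of (A.8) directly. The sketch below assumes the polynomial has the form $\mu_1 \sum_{i=1}^{\nu-1} \frac{(\nu-1)!}{(\nu-i-1)!} [(\nu-2)+\mu_1(\nu-1)]^{-i}$; the function name and the sampled values of $\nu$ and $\mu_1$ are arbitrary choices made for illustration, and a finite scan of this kind supports but does not replace the analytical argument.

```python
from math import factorial


def a8_lhs(nu, mu1):
    """Left-hand side of inequality (A.8), in the form assumed above."""
    base = (nu - 2) + mu1 * (nu - 1)
    return mu1 * sum(
        factorial(nu - 1) // factorial(nu - 1 - i) * base ** (-i)
        for i in range(1, nu)
    )


for nu in (2, 3, 5, 10):
    values = [a8_lhs(nu, mu1) for mu1 in (0.1, 1.0, 10.0, 100.0)]
    print(nu, [round(v, 4) for v in values])
```

For $\nu = 2$ the expression equals one for every $\mu_1$, while for the larger sampled clique sizes it stays below one and grows towards one as $\mu_1$ increases.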
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